Deriving Chronological Information from Texts through a Graph-Based Algorithm
Abstract
We propose a method for deriving the chronological order of events in natural language texts by constraining the temporal boundaries associated with events and projecting them onto a timeline. The algorithm for constraining event temporal boundaries employs deductive inferences on graph structures that encode temporal information extracted from texts. To this end, we consider the task of extracting events from texts. The results obtained show improvement over previous event extraction systems when evaluated on the same corpus.

Introduction

When a story is written, multiple events are described. They occur at different times and, because of this, each topic has its own chronology. Our algorithm for deriving the chronological order of events builds graph representations of the temporal elements extracted from texts and, based on the temporal relations that hold between events and time expressions in these graphs, adjusts the temporal boundaries associated with each event. In our implementation, we used the temporal elements annotated in TimeBank, a corpus with TimeML annotations. TimeML (Pustejovsky et al. 2005) is a specification language for annotating time expressions, events, and the temporal relations holding between these elements.

Placing events on a timeline is useful in various applications such as question answering, information extraction and multidocument summarization. In question answering, for instance, the timeline can play the role of a temporal index of events over the entire document collection. Using this functionality, a question answering system can answer not only simple questions, such as which event(s) happened in a specific time interval, but also complex temporal questions that require resolving the temporal order of events across different documents.
Algorithm for Projecting Events on a Timeline

(Step 1) Building a Graph-based Representation: We define a time-event graph associated with a text as a temporal representation of the events and time expressions encoded in the text. In this graph, the nodes correspond to events and time expressions, while the directed edges correspond to TLINK relations. The TLINK (or Temporal Link) relations represent a class of temporal relations defined in TimeML that can link two events or an event and a time expression. The idea of representing temporal information using graph structures is not new. Allen (1983) proposed, as a framework for reasoning about time, an algebra of temporal intervals in which the elements of the algebra represent a disjunction of 13 relations¹ between temporal intervals. Allen also proposed a constraint propagation algorithm for determining the deductive closure of the temporal relations. In order to avoid difficult search problems along temporal chains of nodes, we convert all the TLINK relations of time-event graphs into Allen's relations as described in (Verhagen 2005), and then apply a simplified version of his closure algorithm.

Copyright © 2007, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

(Step 2) Transforming Interval-based Temporal Relations into Point-based Relations: We reduce the problem of finding the chronological order of events to the problem of assigning temporal intervals to events in a text. Therefore, we associate with every event e in a time-event graph a temporal interval by specifying its starting and ending points, denoted S(e) and E(e) respectively. The purpose of this algorithm is to approximate S(e) and E(e) as accurately as possible while preserving all temporal constraints in which the event e is involved. In order to constrain these temporal boundaries independently, we transform the interval-based relations from time-event graphs into point-based relations.
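As a minimal illustration of such a transformation, each of Allen's 13 interval relations between two intervals A and B can be written as a conjunction of comparisons among their four endpoints. The sketch below is not taken from the paper: the table name `ALLEN_TO_POINTS` and the helpers `point_relations` and `holds` are hypothetical, and the constraint S < E within each interval is assumed rather than listed.

```python
import operator

# Endpoint constraints for Allen's relations "A <relation> B".
# Each triple is (endpoint of A, comparison, endpoint of B); "S" = start, "E" = end.
# Assumption: every interval already satisfies S < E, so that is not repeated here.
ALLEN_TO_POINTS = {
    "BEFORE":        [("E", "<", "S")],
    "AFTER":         [("S", ">", "E")],
    "MEETS":         [("E", "=", "S")],
    "MET-BY":        [("S", "=", "E")],
    "OVERLAPS":      [("S", "<", "S"), ("E", ">", "S"), ("E", "<", "E")],
    "OVERLAPPED-BY": [("S", ">", "S"), ("S", "<", "E"), ("E", ">", "E")],
    "STARTS":        [("S", "=", "S"), ("E", "<", "E")],
    "STARTED-BY":    [("S", "=", "S"), ("E", ">", "E")],
    "DURING":        [("S", ">", "S"), ("E", "<", "E")],
    "CONTAINS":      [("S", "<", "S"), ("E", ">", "E")],
    "FINISHES":      [("S", ">", "S"), ("E", "=", "E")],
    "FINISHED-BY":   [("S", "<", "S"), ("E", "=", "E")],
    "EQUALS":        [("S", "=", "S"), ("E", "=", "E")],
}

OPS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}


def point_relations(event, relation, time):
    """Spell out the point-based relations between an event and a time node."""
    return [(f"{pa}({event})", op, f"{pb}({time})")
            for pa, op, pb in ALLEN_TO_POINTS[relation]]


def holds(a, relation, b):
    """Check a relation on concrete (start, end) pairs via its point constraints."""
    pa = {"S": a[0], "E": a[1]}
    pb = {"S": b[0], "E": b[1]}
    return all(OPS[op](pa[x], pb[y]) for x, op, y in ALLEN_TO_POINTS[relation])


print(point_relations("e", "DURING", "t"))
```

For an event e that holds DURING a time expression t, the sketch yields the two point relations S(e) > S(t) and E(e) < E(t), so the start and the end of e can then be constrained separately.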
For this operation, we consider the framework proposed by (Vilain, Kautz, & van Beek 1990), using the set of binary relations {<, =, >} defined in the time point algebra. After this process is executed, every event node e that is related to a time node t has an associated set of point-based temporal relations R(et).

(Step 3) Deriving Temporal Boundaries of Events: We present a constraint propagation algorithm for deriving the temporal boundaries of events. The algorithm (illustrated in Figure 1) is applied to every event node e of a time-event graph. It consists of two main steps: an expansion step and a contraction step. In the expansion step, because a disjunction of convex temporal relations can exist between an event e and a time t, the interval et associated to the

¹ Allen's 13 relations are: BEFORE (<), AFTER (>), MEETS (M), MET-BY (MI), OVERLAPS (O), OVERLAPPED-BY (OI), STARTS (S), STARTED-BY (SI), DURING (D), CONTAINS (DI), FINISHES (F), FINISHED-BY (FI), EQUALS (=).
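To give a feel for how point-based relations such as those in R(et) can narrow the boundaries of an event, the following sketch tightens numeric bounds on S(e) and E(e) from relations to time expressions whose own boundaries are known. It is not the expansion and contraction algorithm of Figure 1, which is not reproduced in this excerpt. It is a much simpler bound-tightening pass that handles only conjunctions of relations, treats strict and non-strict bounds alike, and uses a hypothetical function `tighten` with made-up numeric time values.

```python
import math


def tighten(constraints, times):
    """Narrow the bounds on S(e) and E(e) for a single event e.

    constraints: list of (event_point, op, time_id, time_point), for example
                 ("S", ">", "t1", "S") meaning S(e) > S(t1).
    times:       {time_id: (start, end)} with known numeric boundaries.
    Returns {"S": (low, high), "E": (low, high)}.
    Assumption: bounds are kept as closed approximations, so "<" is
    handled like "<=".
    """
    bounds = {"S": [-math.inf, math.inf], "E": [-math.inf, math.inf]}
    for event_point, op, time_id, time_point in constraints:
        value = times[time_id][0 if time_point == "S" else 1]
        low, high = bounds[event_point]
        if op in ("<", "="):
            high = min(high, value)
        if op in (">", "="):
            low = max(low, value)
        bounds[event_point] = [low, high]

    # The start of an event cannot follow its end.
    bounds["E"][0] = max(bounds["E"][0], bounds["S"][0])
    bounds["S"][1] = min(bounds["S"][1], bounds["E"][1])

    for point, (low, high) in bounds.items():
        if low > high:
            raise ValueError(f"inconsistent constraints on {point}(e)")
    return {point: tuple(b) for point, b in bounds.items()}


# Hypothetical example: e is DURING t1 = [10, 20] and BEFORE t2 = [15, 30].
times = {"t1": (10, 20), "t2": (15, 30)}
relations = [("S", ">", "t1", "S"), ("E", "<", "t1", "E"), ("E", "<", "t2", "S")]
print(tighten(relations, times))
```

In this example both S(e) and E(e) end up bounded by the range 10 to 15. Handling a disjunction of relations between e and t would require additional logic that this sketch does not attempt.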
Similar resources
Heuristics for Recovery from Residual Ambiguity and Incongruity in the Semantic Interpretations of Texts
In this paper we present the heuristic algorithms at the core of a text meaning analyzer based on the ontological-semantic approach (e.g., Nirenburg and Raskin 2004). The full analyzer takes as input natural language texts and through many levels of analysis produces text meaning representations (TMRs) formulated in a specially developed ontology-based metalanguage that covers semantic as well ...
An Effective Path-aware Approach for Keyword Search over Data Graphs
Abstract—Keyword Search is known as a user-friendly alternative for structured languages to retrieve information from graph-structured data. Efficient retrieving of relevant answers to a keyword query and effective ranking of these answers according to their relevance are two main challenges in the keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...
Using a Fuzzy Rule-based Algorithm to Improve Routing in MPLS Networks
Today, wireless and intelligent networks are widely used in many fields such as information technology and networking. There are several types of these networks, and MPLS networks are one of them. However, MPLS networks raise issues in design and implementation, for example security, throughput, losses, power consumption and so on. Basically,...
Chronological Ordering Based on Context Overlap Detection
Context processing plays an important role in different Natural Language Processing applications. Sentence ordering is one of the critical tasks in text generation. The order of sentences in the raw source texts does not necessarily have to be preserved in the resulting text. Accordingly, chronological sentence ordering is of high importance in this regard. Some researches follo...
Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization
Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...
Publication date: 2007